Efficient and Scalable Sequence-Based XML Filtering

Authors

  • Mariam Salloum
  • Vassilis J. Tsotras
Abstract

The ubiquitous adoption of XML as the standard for data exchange over the web has led to increased interest in building efficient and scalable XML publish-subscribe (pub-sub) systems. The central function of an XML-based pub-sub system is to perform XML filtering efficiently, i.e., to identify those XPath expressions that have a match in a streaming XML document. In this paper, we propose a new sequence-based approach, which transforms both XML documents and XPath twig expressions into Node Encoded Tree Sequences (NETS). In terms of this encoding, we provide a necessary and sufficient condition for an XPath twig to represent a match in a given XML document. The proposed filtering procedure is based on a new subsequence matching algorithm devised for NETS, which identifies the set of matched queries free of false positives with a single scan of the XML document. Extensive experimental results show that the NETS method outperforms previous XML filtering approaches.
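
To make the filtering task itself concrete, the following is a minimal sketch of a naive baseline, not the NETS algorithm described above. It assumes Python's standard `xml.etree.ElementTree` module and its limited XPath subset; the function name `filter_document`, the subscription ids, and the sample document are hypothetical and chosen only for illustration. Unlike the single-scan approach of the paper, this sketch parses the whole document first and evaluates each query separately.

```python
import xml.etree.ElementTree as ET


def filter_document(xml_text, subscriptions):
    """Return the ids of subscriptions whose path expression matches the document.

    Naive baseline for illustration only: the document is fully parsed and
    every query is evaluated on its own, with no sharing of work across queries.
    """
    root = ET.fromstring(xml_text)
    matched = set()
    for sub_id, path in subscriptions.items():
        # find() returns None when the path has no match in the document.
        if root.find(path) is not None:
            matched.add(sub_id)
    return matched


# Hypothetical sample document and subscriptions (assumptions, not from the paper).
doc = "<catalog><book><title>XML</title><author>A</author></book></catalog>"
subs = {
    "q1": "./book[title][author]",  # twig: a book with both a title and an author child
    "q2": "./book/price",           # no price element exists, so this does not match
    "q3": ".//author",              # descendant axis
}
result = filter_document(doc, subs)
```

Here `result` contains the ids of the matching subscriptions, `q1` and `q3`. The cost of this baseline grows with the number of subscriptions, which is the scalability problem that dedicated filtering approaches aim to avoid.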

Related resources

Efficient Filtering and Routing in a Scalable XML-Based Publish-Subscribe System

This paper introduces YAK – a scalable content-based publish-subscribe system. YAK employs XML documents and expressive XPath queries as the publication and subscription model. To achieve high scalability, it combines the advantages of content routing in existing publish-subscribe systems and the efficient query indexing technique in the context of XML filtering. The filtering and routing strate...

XML Filtering Using Dynamic Hierarchical Clustering of User Profiles

Information filtering systems constitute a critical component in modern information seeking applications. As the number of users grows and the volume of available information increases, it is crucial to employ scalable and efficient representation and filtering techniques. In this paper we propose an innovative XML filtering system that utilizes clustering of user profiles in order to reduce the...

XFIS: an XML filtering system based on string representation and matching

Information-filtering systems constitute a critical component of modern information-seeking applications. As the number of users grows and the amount of information available becomes even bigger, it is imperative to employ scalable and efficient representation and filtering techniques. Typically, the use of eXtensible Markup Language (XML) representation entails profile representation with the ...

YFilter: Efficient and Scalable Filtering of XML Documents

Soon, much of the data exchanged over the Internet will be encoded in XML, allowing for sophisticated filtering and content-based routing. We have built a filtering engine called YFilter, which filters streaming XML documents according to XQuery or XPath queries that involve both path expressions and predicates. Unlike previous work, YFilter uses a novel NFA-based execution model. In this demon...

Value-based predicate filtering of XML documents

In recent years, publish–subscribe systems based on XML filtering have received much attention in ubiquitous computing environments and Internet applications. The main challenge is to process a large volume of content against millions of user subscriptions. Several XML filtering systems focus on the efficient processing of structural matching of user subscriptions represented as XPath twig patt...

Publication date: 2009